Search for: All records

Creators/Authors contains: "Yousefi, M."


  1. INTRODUCTION: CRSS-UTDallas initiated and oversaw the efforts to recover Apollo mission communications by re-engineering the NASA SoundScriber playback system and digitizing 30-channel analog audio tapes, covering the entire Apollo-11, Apollo-13, and Gemini-8 missions during 2011-17 [1,6]. This vast data resource was made publicly available along with supplemental speech and language technology (SLT) metadata based on CRSS pipeline diarization transcripts and conversational speaker time-stamps for the Apollo team at the NASA Mission Control Center [2,4]. Renewed efforts over the past year (2021) have resulted in the digitization of an additional 50,000+ hours of audio from the Apollo 7, 8, 9, 10, and 12 missions, as well as the remaining Apollo-13 tapes. These cumulative digitization efforts have enabled the development of the largest publicly available speech data resource of unprompted, real conversations recorded in naturalistic environments. Deployment of this massive corpus has inspired multiple collaborative initiatives, such as the web resources ExploreApollo (https://app.exploreapollo.org) and LanguageARC (https://languagearc.com/projects/21) [3]. ExploreApollo serves as the visualization and playback tool, while LanguageARC is a crowd-sourced subject-content tagging resource developed by undergraduate and graduate students and intended as an educational resource for K-12 students and STEM/Apollo enthusiasts. Significant algorithmic advancements include deep learning models that improve automatic transcript generation quality and extract higher-level knowledge, such as labels for the topics discussed across different mission stages. Efficient transcript generation and topic extraction tools for this naturalistic audio have wide applications, including content archival and retrieval, speaker indexing, education, and group dynamics and team cohesion analysis. Some of these applications have been deployed in our online portals to provide a more immersive experience for students and researchers. Continued worldwide outreach in the form of the Fearless Steps Challenges has proven successful, with the most recent Phase-4 of the challenge series. This challenge has motivated research on low-level tasks such as speaker diarization and high-level tasks such as topic identification.

     IMPACT: Distribution and visualization of the Apollo audio corpus through the above-mentioned online portals and the Fearless Steps Challenges have produced significant impact as a STEM education resource for K-12 students as well as an SLT development resource with real-world applications for research organizations globally. The speech technologies developed by CRSS-UTDallas using the Fearless Steps Apollo corpus have improved previous benchmarks on multiple tasks [1,5]. The continued initiative will extend the current digitization efforts to include over 150,000 hours of audio recorded during all Apollo missions.

     ILLUSTRATION: We will demonstrate the ExploreApollo and LanguageARC online portals with newly digitized audio playback, along with improved SLT baseline systems and results from ASR and topic identification systems, including research performed on the conversational corpus. Performance analysis visualizations will also be illustrated, and we will display results from past challenges and their state-of-the-art system improvements.
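     The corpus metadata described above pairs diarization speaker labels with conversational time-stamps and transcripts. Below is a minimal Python sketch of how such segment-level metadata can support speaker indexing and content retrieval; the Segment structure, channel names, and example values are hypothetical and do not reflect the actual CRSS/Fearless Steps release format.

     # Hypothetical segment-level metadata: speaker labels with time-stamps
     # and transcript text (illustrative values only).
     from dataclasses import dataclass
     from collections import defaultdict

     @dataclass
     class Segment:
         channel: str   # e.g., a Mission Control audio loop (assumed name)
         speaker: str   # diarization speaker label
         start: float   # segment start time in seconds
         end: float     # segment end time in seconds
         text: str      # ASR or manual transcript for the segment

     segments = [
         Segment("FLIGHT", "spk_01", 12.4, 15.9, "go for landing"),
         Segment("FLIGHT", "spk_02", 16.1, 18.0, "copy, go for landing"),
         Segment("EECOM",  "spk_01", 20.3, 24.7, "telemetry looks nominal"),
     ]

     # Speaker indexing: total talk time per diarized speaker label.
     talk_time = defaultdict(float)
     for seg in segments:
         talk_time[seg.speaker] += seg.end - seg.start
     print(dict(talk_time))

     # Content retrieval: segments whose transcript mentions a keyword.
     hits = [s for s in segments if "landing" in s.text.lower()]
     print([(s.channel, s.start, s.end) for s in hits])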
  2. Most current speech technology systems are designed to operate well even in the presence of multiple active speakers. However, most solutions assume that the number of concurrent speakers is known. Unfortunately, this information might not always be available in real-world applications. In this study, we propose a real-time, single-channel, attention-guided Convolutional Neural Network (CNN) to estimate the number of active speakers in overlapping speech. The proposed system extracts higher-level information from the speech spectral content using a CNN model. Next, the attention mechanism summarizes the extracted information into a compact feature vector without losing critical information. Finally, the number of active speakers is classified using a fully connected network. Experiments on simulated overlapping speech using the WSJ corpus show that the attention mechanism improves performance by almost 3% absolute over conventional temporal average pooling. The proposed attention-guided CNN achieves 76.15% for both Weighted Accuracy and average Recall, and 75.80% Precision, on speech segments as short as 20 frames (i.e., 200 ms). All classification metrics exceed 92% for the attention-guided model in offline scenarios where the input signal is more than 100 frames long (i.e., 1 s).
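     The described pipeline (CNN feature extraction over spectral frames, attention-based pooling over time, and a fully connected classifier over speaker counts) can be sketched as follows. This is a minimal PyTorch illustration, not the authors' implementation; the layer sizes, mel-band count, and maximum speaker count are assumptions.

     import torch
     import torch.nn as nn

     class AttentionPooling(nn.Module):
         """Learn per-frame weights and return a weighted average over time."""
         def __init__(self, dim):
             super().__init__()
             self.score = nn.Linear(dim, 1)

         def forward(self, x):                          # x: (batch, frames, dim)
             w = torch.softmax(self.score(x), dim=1)    # (batch, frames, 1)
             return (w * x).sum(dim=1)                  # (batch, dim)

     class SpeakerCountCNN(nn.Module):
         def __init__(self, n_mels=40, max_speakers=4):  # assumed sizes
             super().__init__()
             self.cnn = nn.Sequential(
                 nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
                 nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
             )
             feat_dim = 64 * (n_mels // 4)              # channels x pooled mel bands
             self.pool = AttentionPooling(feat_dim)
             self.classifier = nn.Sequential(
                 nn.Linear(feat_dim, 128), nn.ReLU(),
                 nn.Linear(128, max_speakers + 1),      # classes 0..max_speakers
             )

         def forward(self, spec):                       # spec: (batch, 1, n_mels, frames)
             h = self.cnn(spec)                         # (batch, 64, n_mels//4, frames)
             b, c, f, t = h.shape
             h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)   # frame-level features
             return self.classifier(self.pool(h))       # logits over speaker counts

     # Example: a batch of 200 ms segments (about 20 frames at a 10 ms hop).
     model = SpeakerCountCNN()
     logits = model(torch.randn(8, 1, 40, 20))
     print(logits.shape)                                # torch.Size([8, 5])

     Replacing AttentionPooling with a simple mean over the frame dimension recovers the temporal average pooling baseline that the abstract compares against.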
  3. Seismic tomography models indicate highly variable Earth structure beneath Antarctica with anomalously low shallow mantle viscosities below West Antarctica. An improved projection of the contribution of the Antarctic Ice Sheet to sea‐level change requires consideration of this complexity to precisely account for water expelled into the ocean from uplifting marine sectors. Here we build a high‐resolution 3‐D viscoelastic structure model based on recent inferences of seismic velocity heterogeneity below the continent. The model serves as input to a global‐scale sea‐level model that we use to investigate the influence of solid Earth deformation in Antarctica on future global mean sea‐level (GMSL) rise. Our calculations are based on a suite of ice mass projections generated with a range of climate forcings and suggest that water expulsion from the rebounding marine basins contributes 4%–16% and 7%–14% to the projected GMSL change at 2100 and 2500, respectively.
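     To make the water-expulsion mechanism concrete, the sketch below converts bedrock uplift beneath a marine basin into a global-mean sea-level equivalent by spreading the displaced water volume over the ocean surface. This is a back-of-the-envelope Python illustration, not the paper's sea-level model; the uplift, basin area, and projected GMSL values are placeholders.

     # Back-of-the-envelope conversion of marine-basin uplift to a GMSL equivalent.
     # All input values below are illustrative placeholders, not results from the study.
     OCEAN_AREA_M2 = 3.61e14                  # global ocean surface area (~3.61e8 km^2)

     def expelled_sle_mm(mean_uplift_m, basin_area_m2):
         """Sea-level equivalent (mm) of water displaced when bedrock beneath a
         marine sector uplifts: displaced volume spread over the global ocean."""
         volume_m3 = mean_uplift_m * basin_area_m2
         return volume_m3 / OCEAN_AREA_M2 * 1e3

     # Hypothetical example: 5 m of mean uplift under a 1.0e12 m^2 marine basin.
     sle_mm = expelled_sle_mm(5.0, 1.0e12)
     projected_gmsl_mm = 500.0                # placeholder projected GMSL rise
     print(f"expelled water: {sle_mm:.1f} mm "
           f"({100 * sle_mm / projected_gmsl_mm:.0f}% of the placeholder projection)")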

     